

An Efficient Asynchronous Method for Integrating Evolutionary and Gradient-based Policy Search

Neural Information Processing Systems

Deep reinforcement learning (DRL) algorithms and evolution strategies (ES) have been applied to various tasks, showing excellent performance. The two have opposite properties: DRL has good sample efficiency but poor stability, while ES is the reverse. Recently, there have been attempts to combine these algorithms, but these methods rely entirely on a synchronous update scheme, which is not ideal for maximizing the benefits of the parallelism in ES. To address this challenge, an asynchronous update scheme was introduced, which offers good time efficiency and diverse policy exploration. In this paper, we introduce Asynchronous Evolution Strategy-Reinforcement Learning (AES-RL), which maximizes the parallel efficiency of ES and integrates it with policy gradient methods. Specifically, we propose 1) a novel framework to merge ES and DRL asynchronously and 2) various asynchronous update methods that can take full advantage of asynchronism, ES, and DRL, namely exploration and time efficiency, stability, and sample efficiency, respectively. The proposed framework and update methods are evaluated on continuous control benchmarks, showing superior performance as well as time efficiency compared to previous methods.
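
The central idea, an asynchronous scheme in which ES workers and a gradient-based learner update shared parameters without waiting for a full generation, can be illustrated with a toy sketch. The Python below is a hypothetical illustration, not the paper's AES-RL implementation: the fitness function, the shared-lock design, and all step sizes are assumptions made for the example.

```python
# Illustrative sketch (not the paper's AES-RL): ES workers and a
# gradient worker asynchronously update a shared parameter vector,
# never blocking on a full synchronous generation.
import threading
import numpy as np

DIM = 10
TARGET = np.linspace(-1.0, 1.0, DIM)   # optimum of the toy objective

def fitness(theta):
    """Toy objective standing in for an RL episode return."""
    return -np.sum((theta - TARGET) ** 2)

class SharedPolicy:
    """Central parameters; each update grabs the lock, so no worker
    ever waits for the slowest evaluation in a generation."""
    def __init__(self, dim):
        self.theta = np.zeros(dim)
        self.lock = threading.Lock()

def es_worker(policy, rng, steps=200, sigma=0.1, lr=0.05):
    # Antithetic ES estimate, pushed as soon as the evaluation finishes.
    for _ in range(steps):
        with policy.lock:
            base = policy.theta.copy()
        eps = rng.standard_normal(base.shape)
        advantage = fitness(base + sigma * eps) - fitness(base - sigma * eps)
        with policy.lock:
            policy.theta += lr * advantage / (2 * sigma) * eps

def grad_worker(policy, steps=200, lr=0.02):
    # Stand-in for the DRL side: a true gradient ascent step on the
    # toy objective, applied alongside the ES workers.
    for _ in range(steps):
        with policy.lock:
            policy.theta += lr * 2 * (TARGET - policy.theta)

policy = SharedPolicy(DIM)
threads = [threading.Thread(target=es_worker, args=(policy, np.random.default_rng(i)))
           for i in range(4)]
threads.append(threading.Thread(target=grad_worker, args=(policy,)))
for t in threads: t.start()
for t in threads: t.join()
print("final fitness:", fitness(policy.theta))
```

The contrast with a synchronous scheme is that no worker blocks on the slowest rollout; each contributes its update the moment its own evaluation completes.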


Convergence for Discrete Parameter Update Schemes

Wilson, Paul, Zanasi, Fabio, Constantinides, George

arXiv.org Artificial Intelligence

Modern deep learning models require immense computational resources, motivating research into low-precision training. Quantised training addresses this by representing training components in low-bit integers, but typically relies on discretising real-valued updates. We introduce an alternative approach where the update rule itself is discrete, avoiding the quantisation of continuous updates by design. We establish convergence guarantees for a general class of such discrete schemes, and present a multinomial update rule as a concrete example, supported by empirical evaluation. This perspective opens new avenues for efficient training, particularly for models with inherently discrete structure.
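
As a hedged illustration of what a discrete update rule might look like, the sketch below keeps parameters on an integer grid and samples each coordinate's step from {-1, 0, +1} with gradient-dependent probabilities, so no real-valued update is ever formed and then quantised. The probability mapping (a tanh of the gradient magnitude) and the helper multinomial_step are assumptions made for this example, not the paper's multinomial rule.

```python
# Hedged sketch of a discrete, sampled update rule; the probability
# mapping below is an assumption, not the paper's exact scheme.
import numpy as np

def multinomial_step(params, grad, temperature=1.0, rng=None):
    """Sample an integer step per coordinate from {-1, 0, +1}.

    The probability of moving (in the descent direction -sign(grad))
    grows with |grad|; no continuous update is formed or quantised.
    """
    rng = rng or np.random.default_rng()
    p_move = np.tanh(np.abs(grad) / temperature)   # in [0, 1)
    move = rng.random(params.shape) < p_move       # Bernoulli draw
    return params - move * np.sign(grad).astype(int)

# Toy descent on f(x) = sum((x - target)^2) over integer parameters.
rng = np.random.default_rng(0)
target = np.array([3, -2, 7, 0])
x = np.zeros(4, dtype=int)
for _ in range(200):
    grad = 2 * (x - target)
    x = multinomial_step(x, grad, temperature=4.0, rng=rng)
print(x)  # should be near [3, -2, 7, 0]
```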





Spectral-factorized Positive-definite Curvature Learning for NN Training

Lin, Wu, Dangel, Felix, Eschenhagen, Runa, Bae, Juhan, Turner, Richard E., Grosse, Roger B.

arXiv.org Machine Learning

Many training methods, such as Adam(W) and Shampoo, learn a positive-definite curvature matrix and apply an inverse root before preconditioning. Recently, non-diagonal training methods, such as Shampoo, have gained significant attention; however, they remain computationally inefficient and are limited to specific types of curvature information due to the costly matrix root computation via matrix decomposition. To address this, we propose a Riemannian optimization approach that dynamically adapts spectral-factorized positive-definite curvature estimates, enabling the efficient application of arbitrary matrix roots and generic curvature learning. We demonstrate the efficacy and versatility of our approach in positive-definite matrix optimization and covariance adaptation for gradient-free optimization, as well as its efficiency in curvature learning for neural net training.
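
A minimal sketch of the underlying trick: if the curvature estimate is kept in spectral-factorized form B = Q diag(d) Qᵀ, any power B^(-p) can be applied via elementwise powers of the eigenvalues, with no fresh decomposition. The SpectralPreconditioner class below is a hypothetical illustration; in particular, its observe method refreshes the factors by re-eigendecomposing an EMA estimate, which is exactly the costly step the paper's Riemannian updates are designed to avoid.

```python
# Sketch (not the paper's Riemannian method): a curvature estimate kept
# in spectral-factorized form, so arbitrary matrix roots are cheap.
import numpy as np

class SpectralPreconditioner:
    def __init__(self, dim, damping=1e-4):
        self.Q = np.eye(dim)    # eigenvectors
        self.d = np.ones(dim)   # eigenvalues (positive)
        self.damping = damping

    def precondition(self, grad, p=0.5):
        """Apply B^(-p) to grad; p=0.5 mimics a Shampoo-style inverse
        root, p=1.0 a full inverse (Newton-like)."""
        coeff = (self.d + self.damping) ** (-p)
        return self.Q @ (coeff * (self.Q.T @ grad))

    def observe(self, grad, beta=0.95):
        # Crude refresh: EMA of gradient outer products followed by a
        # re-eigendecomposition -- the expensive step the paper avoids
        # by updating (Q, d) directly with Riemannian steps.
        B = self.Q @ np.diag(self.d) @ self.Q.T
        B = beta * B + (1 - beta) * np.outer(grad, grad)
        self.d, self.Q = np.linalg.eigh(B)

# Usage on a toy quadratic: minimize 0.5 * x^T A x.
A = np.diag([1.0, 10.0, 100.0])
x = np.ones(3)
pre = SpectralPreconditioner(3)
for _ in range(100):
    g = A @ x
    pre.observe(g)
    x -= 0.1 * pre.precondition(g, p=0.5)
print(x)  # shrinks toward the origin (sign-SGD-like near the optimum)
```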

